
Specious rules: an efficient and effective unifying method for removing misleading and uninformative patterns in association rule mining



Abstract

We present theoretical analysis and a suite of tests and procedures for addressing a broad class of redundant and misleading association rules we call \emph{specious rules}. Specious dependencies, also known as \emph{spurious}, \emph{apparent}, or \emph{illusory associations}, refer to a well-known phenomenon where marginal dependencies are merely products of interactions with other variables and disappear when conditioned on those variables. The most extreme example is Yule-Simpson's paradox, where two variables present positive dependence in the marginal contingency table but negative dependence in all partial tables defined by different levels of a confounding factor. It is accepted wisdom that in data of any nontrivial dimensionality it is infeasible to control for all of the exponentially many possible confounds of this nature. In this paper, we consider the problem of specious dependencies in the context of statistical association rule mining. We define specious rules and show they offer a unifying framework which covers many types of previously proposed redundant or misleading association rules. After theoretical analysis, we introduce practical algorithms for detecting and pruning out specious association rules efficiently under many key goodness measures, including mutual information and exact hypergeometric probabilities. We demonstrate that the procedure greatly reduces the number of associations discovered, providing an elegant and effective solution to the problem of association mining discovering large numbers of misleading and redundant rules.
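The Yule-Simpson reversal described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm: the counts are illustrative, the names `strata`, `leverage`, and `marginal` are hypothetical, and leverage, P(X,Y) - P(X)P(Y), is used only as a simple stand-in for the goodness measures the abstract names (mutual information, exact hypergeometric probabilities). The sketch checks that X and Y are positively dependent in the pooled table yet negatively dependent in every partial table defined by a confounder.

```python
from fractions import Fraction

# Illustrative counts (an assumption, not data from the paper).
# Each stratum is one level of a confounder Z; cells are keyed by (x, y),
# where x = 1 if the rule antecedent holds and y = 1 if the consequent holds.
strata = {
    "z=small": {(1, 1): 234, (1, 0): 36, (0, 1): 81, (0, 0): 6},
    "z=large": {(1, 1): 55, (1, 0): 25, (0, 1): 192, (0, 0): 71},
}

def leverage(table):
    """Return P(X=1, Y=1) - P(X=1) * P(Y=1) for a 2x2 table of counts."""
    n = sum(table.values())
    n_xy = table[(1, 1)]
    n_x = table[(1, 1)] + table[(1, 0)]
    n_y = table[(1, 1)] + table[(0, 1)]
    return Fraction(n_xy, n) - Fraction(n_x, n) * Fraction(n_y, n)

def marginal(tables):
    """Pool the partial tables into the marginal 2x2 table (Z ignored)."""
    pooled = {}
    for table in tables.values():
        for cell, count in table.items():
            pooled[cell] = pooled.get(cell, 0) + count
    return pooled

pooled_leverage = leverage(marginal(strata))
partial_leverages = {z: leverage(t) for z, t in strata.items()}

# The marginal dependence is positive, but it is negative at every level of Z,
# so the rule X -> Y looks good only because Z was not controlled for.
is_reversal = pooled_leverage > 0 and all(v < 0 for v in partial_leverages.values())
print("marginal leverage:", float(pooled_leverage))
for z, v in partial_leverages.items():
    print(z, "leverage:", float(v))
print("Yule-Simpson reversal:", is_reversal)
```

Exact fractions are used so that the sign of each leverage is not affected by floating-point rounding. A practical procedure would replace the sign check with a statistical test, since small conditional dependencies can arise by chance.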
